{
  "cells": [
    {
      "cell_type": "raw",
      "metadata": {},
      "source": [
        "---\n",
        "sidebar_position: 0\n",
        "---"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "# Build a Question Answering application over a Graph Database\n",
        "\n",
        "In this guide we'll go over the basic ways to create a Q&A chain over a graph database. These systems will allow us to ask a question about the data in a graph database and get back a natural language answer.\n",
        "\n",
        "## ⚠️ Security note ⚠️\n",
        "\n",
        "Building Q&A systems of graph databases requires executing model-generated graph queries. There are inherent risks in doing this. Make sure that your database connection permissions are always scoped as narrowly as possible for your chain/agent's needs. This will mitigate though not eliminate the risks of building a model-driven system. For more on general security best practices, [see here](/docs/security).\n",
        "\n",
        "\n",
        "## Architecture\n",
        "\n",
        "At a high-level, the steps of most graph chains are:\n",
        "\n",
        "1. **Convert question to a graph database query**: Model converts user input to a graph database query (e.g. Cypher).\n",
        "2. **Execute graph database query**: Execute the graph database query.\n",
        "3. **Answer the question**: Model responds to user input using the query results.\n",
        "\n",
        "\n",
        "![sql_usecase.png](../../static/img/graph_usecase.png)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "## Setup\n",
        "#### Install dependencies\n",
        "\n",
        "```{=mdx}\n",
        "import IntegrationInstallTooltip from \"@mdx_components/integration_install_tooltip.mdx\";\n",
        "import Npm2Yarn from \"@theme/Npm2Yarn\";\n",
        "\n",
        "<IntegrationInstallTooltip></IntegrationInstallTooltip>\n",
        "\n",
        "<Npm2Yarn>\n",
        "  langchain @langchain/community @langchain/openai neo4j-driver\n",
        "</Npm2Yarn>\n",
        "```\n",
        "\n",
        "#### Set environment variables\n",
        "\n",
        "We'll use OpenAI in this example:\n",
        "\n",
        "```env\n",
        "OPENAI_API_KEY=your-api-key\n",
        "\n",
        "# Optional, use LangSmith for best-in-class observability\n",
        "LANGSMITH_API_KEY=your-api-key\n",
        "LANGCHAIN_TRACING_V2=true\n",
        "```\n",
        "\n",
        "Next, we need to define Neo4j credentials.\n",
        "Follow [these installation steps](https://neo4j.com/docs/operations-manual/current/installation/) to set up a Neo4j database.\n",
        "\n",
        "```env\n",
        "NEO4J_URI=\"bolt://localhost:7687\"\n",
        "NEO4J_USERNAME=\"neo4j\"\n",
        "NEO4J_PASSWORD=\"password\"\n",
        "```"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "The below example will create a connection with a Neo4j database and will populate it with example data about movies and their actors."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 3,
      "metadata": {},
      "outputs": [
        {
          "name": "stdout",
          "output_type": "stream",
          "text": [
            "Schema refreshed successfully.\n"
          ]
        },
        {
          "data": {
            "text/plain": [
              "[]"
            ]
          },
          "execution_count": 3,
          "metadata": {},
          "output_type": "execute_result"
        }
      ],
      "source": [
        "import \"neo4j-driver\";\n",
        "import { Neo4jGraph } from \"@langchain/community/graphs/neo4j_graph\";\n",
        "\n",
        "const url = Deno.env.get(\"NEO4J_URI\");\n",
        "const username = Deno.env.get(\"NEO4J_USER\");\n",
        "const password = Deno.env.get(\"NEO4J_PASSWORD\");\n",
        "const graph = await Neo4jGraph.initialize({ url, username, password });\n",
        "\n",
        "// Import movie information\n",
        "const moviesQuery = `LOAD CSV WITH HEADERS FROM \n",
        "'https://raw.githubusercontent.com/tomasonjo/blog-datasets/main/movies/movies_small.csv'\n",
        "AS row\n",
        "MERGE (m:Movie {id:row.movieId})\n",
        "SET m.released = date(row.released),\n",
        "    m.title = row.title,\n",
        "    m.imdbRating = toFloat(row.imdbRating)\n",
        "FOREACH (director in split(row.director, '|') | \n",
        "    MERGE (p:Person {name:trim(director)})\n",
        "    MERGE (p)-[:DIRECTED]->(m))\n",
        "FOREACH (actor in split(row.actors, '|') | \n",
        "    MERGE (p:Person {name:trim(actor)})\n",
        "    MERGE (p)-[:ACTED_IN]->(m))\n",
        "FOREACH (genre in split(row.genres, '|') | \n",
        "    MERGE (g:Genre {name:trim(genre)})\n",
        "    MERGE (m)-[:IN_GENRE]->(g))`\n",
        "\n",
        "await graph.query(moviesQuery);"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "## Graph schema\n",
        "\n",
        "In order for an LLM to be able to generate a Cypher statement, it needs information about the graph schema. When you instantiate a graph object, it retrieves the information about the graph schema. If you later make any changes to the graph, you can run the `refreshSchema` method to refresh the schema information."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 4,
      "metadata": {},
      "outputs": [
        {
          "name": "stdout",
          "output_type": "stream",
          "text": [
            "Node properties are the following:\n",
            "Movie {imdbRating: FLOAT, id: STRING, released: DATE, title: STRING}, Person {name: STRING}, Genre {name: STRING}\n",
            "Relationship properties are the following:\n",
            "\n",
            "The relationships are the following:\n",
            "(:Movie)-[:IN_GENRE]->(:Genre), (:Person)-[:DIRECTED]->(:Movie), (:Person)-[:ACTED_IN]->(:Movie)\n"
          ]
        }
      ],
      "source": [
        "await graph.refreshSchema()\n",
        "console.log(graph.schema)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "Great! We've got a graph database that we can query. Now let's try hooking it up to an LLM.\n",
        "\n",
        "## Chain\n",
        "\n",
        "Let's use a simple chain that takes a question, turns it into a Cypher query, executes the query, and uses the result to answer the original question.\n",
        "\n",
        "![graph_chain.webp](../../static/img/graph_chain.webp)\n",
        "\n",
        "\n",
        "LangChain comes with a built-in chain for this workflow that is designed to work with Neo4j: [GraphCypherQAChain](https://python.langchain.com/docs/use_cases/graph/graph_cypher_qa)"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 5,
      "metadata": {},
      "outputs": [
        {
          "data": {
            "text/plain": [
              "{ result: \u001b[32m\"James Woods, Joe Pesci, Robert De Niro, Sharon Stone\"\u001b[39m }"
            ]
          },
          "execution_count": 5,
          "metadata": {},
          "output_type": "execute_result"
        }
      ],
      "source": [
        "import { GraphCypherQAChain } from \"langchain/chains/graph_qa/cypher\";\n",
        "import { ChatOpenAI } from \"@langchain/openai\";\n",
        "\n",
        "const llm = new ChatOpenAI({ model: \"gpt-3.5-turbo\", temperature: 0 })\n",
        "const chain = GraphCypherQAChain.fromLLM({\n",
        "  llm,\n",
        "  graph,\n",
        "});\n",
        "const response = await chain.invoke({ query: \"What was the cast of the Casino?\" })\n",
        "response"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "### Next steps\n",
        "\n",
        "For more complex query-generation, we may want to create few-shot prompts or add query-checking steps. For advanced techniques like this and more check out:\n",
        "\n",
        "* [Prompting strategies](/docs/how_to/graph_prompting): Advanced prompt engineering techniques.\n",
        "* [Mapping values](/docs/how_to/graph_mapping/): Techniques for mapping values from questions to database.\n",
        "* [Semantic layer](/docs/how_to/graph_semantic): Techniques for working implementing semantic layers.\n",
        "* [Constructing graphs](/docs/how_to/graph_constructing): Techniques for constructing knowledge graphs.\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": []
    }
  ],
  "metadata": {
    "kernelspec": {
      "display_name": "Deno",
      "language": "typescript",
      "name": "deno"
    },
    "language_info": {
      "file_extension": ".ts",
      "mimetype": "text/x.typescript",
      "name": "typescript",
      "nb_converter": "script",
      "pygments_lexer": "typescript",
      "version": "5.4.3"
    }
  },
  "nbformat": 4,
  "nbformat_minor": 4
}
